Text File | 1995-03-27 | 76KB | 1,589 lines
Common Lisp the Language, 2nd Edition
-------------------------------------------------------------------------------
2. Data Types
Common Lisp provides a variety of types of data objects. It is important to
note that in Lisp it is data objects that are typed, not variables. Any
variable can have any Lisp object as its value. (It is possible to make an
explicit declaration that a variable will in fact take on one of only a limited
set of values. However, such a declaration may always be omitted, and the
program will still run correctly. Such a declaration merely constitutes advice
from the user that may be useful in gaining efficiency. See declare.)
In Common Lisp, a data type is a (possibly infinite) set of Lisp objects. Many
Lisp objects belong to more than one such set, and so it doesn't always make
sense to ask what is the type of an object; instead, one usually asks only
whether an object belongs to a given type. The predicate typep may be used to
ask whether an object belongs to a given type, and the function type-of returns
a type to which a given object belongs.
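As a small sketch of this point (the helper name types-containing and the
sample value 42 are chosen here purely for illustration), typep can confirm
membership in several types for one object, whereas type-of reports only one
type, and which one is implementation-dependent:

```lisp
;; Hypothetical helper: filter a list of type specifiers down to
;; those that contain OBJECT.
(defun types-containing (object candidates)
  "Return those type specifiers in CANDIDATES to which OBJECT belongs."
  (remove-if-not (lambda (type) (typep object type)) candidates))

;; One object, several types: 42 belongs to three of these four sets.
(types-containing 42 '(integer rational number string))
;; => (INTEGER RATIONAL NUMBER)

;; TYPE-OF returns some type containing the object. The exact
;; specifier varies between implementations, but this always holds:
(typep 42 (type-of 42))   ; true
```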
The data types defined in Common Lisp are arranged into a hierarchy (actually a
partial order) defined by the subset relationship. Certain sets of objects,
such as the set of numbers or the set of strings, are interesting enough to
deserve labels. Symbols are used for most such labels (here, and throughout
this book, the word ``symbol'' refers to atomic symbols, one kind of Lisp
object, elsewhere known as literal atoms). See chapter 4 for a complete
description of type specifiers.
The set of all objects is specified by the symbol t. The empty data type, which
contains no objects, is denoted by nil.
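The subset relationship can be probed with the standard function subtypep,
which is not otherwise discussed in this overview. The following is only a
sketch; the particular types tested are arbitrary examples:

```lisp
;; SUBTYPEP returns two values: the answer, and whether that answer
;; is certain. For the standard types below both values are true.
(subtypep 'integer 'rational)   ; integers are a subset of the rationals
(subtypep 'rational 'number)    ; rationals are a subset of the numbers
(subtypep 'integer t)           ; every type is a subtype of T
(subtypep nil 'integer)         ; the empty type NIL is a subtype of every type

;; Every object is of type T; no object is of type NIL.
(typep "any object" t)          ; true
(typep "any object" nil)        ; false
```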
[old_change_begin]
A type called common encompasses all the data objects required by the Common
Lisp language. A Common Lisp implementation is free to provide other data types
that are not subtypes of common.
[old_change_end]
[change_begin]
X3J13 voted in March 1989 (COMMON-TYPE) to remove the type common (and the
predicate commonp) from the language, on the grounds that it has not proved to
be useful in practice and that it could be difficult to redefine in the face of
other changes to the Common Lisp type system (such as the introduction of CLOS
classes).
[change_end]
The following categories of Common Lisp objects are of particular interest:
numbers, characters, symbols, lists, arrays, structures, and functions. There
are others as well. Some of these categories have many subdivisions. There are
also standard types defined to be the union of two or more of these categories.
The categories listed above, while they are data types, are neither more nor
less ``real'' than other data types; they simply constitute a particularly
useful slice across the type hierarchy for expository purposes.
Here are brief descriptions of various Common Lisp data types. The remaining
sections of this chapter go into more detail and also describe notations for
objects of each type. Descriptions of Lisp functions that operate on data
objects of each type appear in later chapters.
* Numbers are provided in various forms and representations. Common Lisp
provides a true integer data type: any integer, positive or negative, has
in principle a representation as a Common Lisp data object, subject only
to total memory limitations (rather than machine word width). A true
rational data type is provided: the quotient of two integers, if not an
integer, is a ratio. Floating-point numbers of various ranges and
precisions are also provided, as well as Cartesian complex numbers.
* Characters represent printed glyphs such as letters or text formatting
operations. Strings are one-dimensional arrays of characters. Common Lisp
provides for a rich character set, including ways to represent characters
of various type styles.
* Symbols (sometimes called atomic symbols for emphasis or clarity) are
named data objects. Lisp provides machinery for locating a symbol object,
given its name (in the form of a string). Symbols have property lists,
which in effect allow symbols to be treated as record structures with an
extensible set of named components, each of which may be any Lisp object.
Symbols also serve to name functions and variables within programs.
* Lists are sequences represented in the form of linked cells called
conses. There is a special object (the symbol nil) that is the empty list.
All other lists are built recursively by adding a new element to the front
of an existing list. This is done by creating a new cons, which is an
object having two components called the car and the cdr. The car may hold
anything, and the cdr is made to point to the previously existing list.
(Conses may actually be used completely generally as two-element record
structures, but their most important use is to represent lists.)
* Arrays are dimensioned collections of objects. An array can have any
non-negative number of dimensions and is indexed by a sequence of
integers. A general array can have any Lisp object as a component; other
types of arrays are specialized for efficiency and can hold only certain
types of Lisp objects. It is possible for two arrays, possibly with
differing dimension information, to share the same set of elements (such
that modifying one array modifies the other also) by causing one to be
displaced to the other. One-dimensional arrays of any kind are called
vectors. One-dimensional arrays of characters are called strings.
One-dimensional arrays of bits (that is, of integers whose values are 0 or
1) are called bit-vectors.
* Hash tables provide an efficient way of mapping any Lisp object (a key)
to an associated object.
* Readtables are used to control the built-in expression parser read.
* Packages are collections of symbols that serve as name spaces. The parser
recognizes symbols by looking up character sequences in the current
package.
* Pathnames represent names of files in a fairly implementation-independent
manner. They are used to interface to the external file system.
* Streams represent sources or sinks of data, typically characters or
bytes. They are used to perform I/O, as well as for internal purposes such
as parsing strings.
* Random-states are data structures used to encapsulate the state of the
built-in random-number generator.
* Structures are user-defined record structures, objects that have named
components. The defstruct facility is used to define new structure types.
Some Common Lisp implementations may choose to implement certain
system-supplied data types, such as bignums, readtables, streams, hash
tables, and pathnames, as structures, but this fact will be invisible to
the user.
[old_change_begin]
* Functions are objects that can be invoked as procedures; these may take
arguments and return values. (All Lisp procedures can be construed to
return values and therefore every procedure is a function.) Such objects
include compiled-functions (compiled code objects). Some functions are
represented as a list whose car is a particular symbol such as lambda.
Symbols may also be used as functions.
[old_change_end]
[change_begin]
X3J13 voted in June 1988 (FUNCTION-TYPE) to specify that symbols are not of
type function, but are automatically coerced to functions in certain situations
(see section 2.13).
X3J13 voted in June 1988 (CONDITION-SYSTEM) to adopt the Common Lisp
Condition System, thereby introducing a new category of data objects:
* Conditions are objects used to affect control flow in certain
conventional ways by means of signals and handlers that intercept those
signals. In particular, errors are signaled by raising particular
conditions, and errors may be trapped by establishing handlers for those
conditions.
X3J13 voted in June 1988 (CLOS) to adopt the Common Lisp Object System,
thereby introducing additional categories of data objects:
* Classes determine the structure and behavior of other objects, their
instances. Every Common Lisp data object belongs to some class. (In some
ways the CLOS class system is a generalization of the system of type
specifiers of the first edition of this book, but the class system
augments the type system rather than supplanting it.)
* Methods are chunks of code that operate on arguments satisfying a
particular pattern of classes. Methods are not functions; they are not
invoked directly on arguments but instead are bundled into generic
functions.
* Generic functions are functions that contain, among other information, a
set of methods. When invoked, a generic function executes a subset of its
methods. The subset chosen for execution depends in a specific way on the
classes or identities of the arguments to which it is applied.
[change_end]
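The condition and CLOS categories described above can be sketched briefly. The
names safe-divide and describe-kind are hypothetical, invented for this
illustration; the operators handler-case, defgeneric, and defmethod are the
standard ones:

```lisp
;; A condition signaled by a failing division is trapped by a handler
;; established with HANDLER-CASE.
(defun safe-divide (numerator denominator)
  "Return NUMERATOR/DENOMINATOR, or :UNDEFINED on division by zero."
  (handler-case (/ numerator denominator)
    (division-by-zero () :undefined)))

;; A generic function bundles methods; the classes of the arguments
;; determine which method is executed.
(defgeneric describe-kind (object)
  (:documentation "Return a keyword naming the broad category of OBJECT."))

(defmethod describe-kind ((object integer)) :integer)
(defmethod describe-kind ((object string))  :string)
(defmethod describe-kind ((object t))       :something-else)

(safe-divide 10 4)       ; => 5/2
(safe-divide 1 0)        ; => :UNDEFINED
(describe-kind 7)        ; => :INTEGER
(describe-kind "seven")  ; => :STRING
(describe-kind 'seven)   ; => :SOMETHING-ELSE
```

Note that the methods are never called directly; only the generic function
describe-kind is invoked.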
These categories are not always mutually exclusive. The required relationships
among the various data types are explained in more detail in section 2.15.
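A few checks with typep illustrate how the categories overlap. This is a
sketch only; the sample objects are arbitrary:

```lisp
;; NIL is both a symbol and a list (the empty list).
(and (typep nil 'symbol) (typep nil 'list))            ; true

;; A string is a vector, and every vector is an array.
(and (typep "abc" 'string)
     (typep "abc" 'vector)
     (typep "abc" 'array))                             ; true

;; A bit-vector is likewise a vector.
(typep #*1011 'vector)                                 ; true

;; Other categories are disjoint: a cons is never a number.
(typep (cons 1 2) 'number)                             ; false
```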
-------------------------------------------------------------------------------
* Numbers
o Integers
o Ratios
o Floating-Point Numbers
o Complex Numbers
* Characters
o Standard Characters
o Line Divisions
o Non-standard Characters
o Character Attributes
o String Characters
* Symbols
* Lists and Conses
* Arrays
o Vectors
o Strings
o Bit-Vectors
* Hash Tables
* Readtables
* Packages
* Pathnames
* Streams
* Random-States
* Structures
* Functions
* Unreadable Data Objects
* Overlap, Inclusion, and Disjointness of Types
2.1. Numbers
Several kinds of numbers are defined in Common Lisp. They are divided into
integers; ratios; floating-point numbers, with names provided for up to four
different floating-point representations; and complex numbers.
[change_begin]
X3J13 voted in March 1989 (REAL-NUMBER-TYPE) to add the type real.
The number data type encompasses all kinds of numbers. For convenience, there
are names for some subclasses of numbers as well. Integers and ratios are of
type rational. Rational numbers and floating-point numbers are of type real.
Real numbers and complex numbers are of type number.
Although the names of these types were chosen with the terminology of
mathematics in mind, the correspondences are not always exact. Integers and
ratios model the corresponding mathematical concepts directly. Numbers of type
float may be used to approximate real numbers, both rational and irrational.
The real type includes all Common Lisp numbers that represent mathematical real
numbers, though there are mathematical real numbers (irrational numbers) that
do not have an exact Common Lisp representation. Only real numbers may be
ordered using the <, >, <=, and >= functions.
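The relationships among these numeric types can be checked directly. The
following is a minimal sketch using arbitrary sample values:

```lisp
(typep 3 'rational)        ; true: integers are rationals
(typep 3/4 'real)          ; true: rationals are reals
(typep 0.75 'real)         ; true: floating-point numbers are reals
(typep #C(0 1) 'real)      ; false: a complex number is not a real
(typep #C(0 1) 'number)    ; true: but it is a number

;; The ordering functions accept mixed rational and floating-point
;; arguments, since all of them are reals.
(< 1/3 0.5 1)              ; true

;; By contrast, (< #C(0 1) 1) is not a valid call, because #C(0 1)
;; is not of type REAL.
```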
-------------------------------------------------------------------------------
Compatibility note: The Fortran 77 standard defines the term real datum to mean
``a processor approximation to the value of a real number.'' In practice the
Fortran basic real type is the floating-point data type that Common Lisp calls
single-float. The Fortran double precision type is Common Lisp's double-float.
The Pascal real data type is an ``implementation-defined subset of the real
numbers.'' In practice this is usually a floating-point type, often what Common
Lisp calls double-float.
A translation of an algorithm written in Fortran or Pascal that uses real data
usually will use some appropriate precision of Common Lisp's float type. Some
algorithms may gain accuracy or flexibility by using Common Lisp's rational or
real type instead.
-------------------------------------------------------------------------------
[change_end]
-------------------------------------------------------------------------------
* Integers
* Ratios
* Floating-Point Numbers
* Complex Numbers
2.1.1. Integers
The integer data type is intended to represent mathematical integers. Unlike
most programming languages, Common Lisp in principle imposes no limit on the
magnitude of an integer; storage is automatically allocated as necessary to
represent large integers.
In every Common Lisp implementation there is a range of integers that are
represented more efficiently than others; each such integer is called a fixnum,
and an integer that is not a fixnum is called a bignum. Common Lisp is designed
to hide this distinction as much as possible; the distinction between fixnums
and bignums is visible to the user in only a few places where the efficiency of
representation is important. Exactly which integers are fixnums is
implementation-dependent; typically they will be those integers in the range
-2^n to 2^n - 1, inclusive, for some n not less than 15. See
most-positive-fixnum and most-negative-fixnum.
[change_begin]
X3J13 voted in January 1989 (FIXNUM-NON-PORTABLE) to specify that fixnum must
be a supertype of the type (signed-byte 16), and additionally that the value of
array-dimension-limit must be a fixnum (implying that the implementor should
choose the range of fixnums to be large enough to accommodate the largest size
of array to be supported).
-------------------------------------------------------------------------------
Rationale: This specification allows programmers to declare variables in
portable code to be of type fixnum for efficiency. Fixnums are guaranteed to
encompass at least the set of 16-bit signed integers (compare this to the data
type short int in the C programming language). In addition, any valid array
index must be a fixnum, and therefore variables used to hold array indices
(such as a dotimes variable) may be declared fixnum in portable code.
-------------------------------------------------------------------------------
[change_end]
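The guarantees just described can be checked in any implementation, and the
rationale suggests how a fixnum declaration might be used. This is a sketch;
the function name sum-elements is hypothetical:

```lisp
;; The 16-bit signed integers are always fixnums.
(typep 32767 'fixnum)                        ; true
(typep -32768 'fixnum)                       ; true
(typep array-dimension-limit 'fixnum)        ; true

;; One step past the fixnum range yields a bignum; arithmetic
;; continues to work without any special action by the program.
(typep (1+ most-positive-fixnum) 'bignum)    ; true

;; An array index may portably be declared to be a fixnum.
(defun sum-elements (vector)
  "Return the sum of the numbers stored in VECTOR."
  (let ((sum 0))
    (dotimes (i (length vector) sum)
      (declare (fixnum i))
      (incf sum (aref vector i)))))

(sum-elements #(1 2 3))                      ; => 6
```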
Integers are ordinarily written in decimal notation, as a sequence of decimal
digits, optionally preceded by a sign and optionally followed by a decimal
point. For example:
0 ;Zero
-0 ;This always means the same as 0
+6 ;The first perfect number
28 ;The second perfect number
1024. ;Two to the tenth power
-1 ;
15511210043330985984000000. ;25 factorial (25!), probably a bignum
-------------------------------------------------------------------------------
Compatibility note: MacLisp and Lisp Machine Lisp normally assume that integers
are written in octal (radix-8) notation unless a decimal point is present.
Interlisp assumes integers are written in decimal notation and uses a trailing
Q to indicate octal radix; however, a decimal point, even in trailing position,
always indicates a floating-point number. This is of course consistent with
Fortran. Ada does not permit trailing decimal points but instead requires them
to be embedded. In Common Lisp, integers written as described above are always
construed to be in decimal notation, whether or not the decimal point is
present; allowing the decimal point to be present permits compatibility with
MacLisp.
-------------------------------------------------------------------------------
Integers may be notated in radices other than ten. The notation
#nnrddddd or #nnRddddd
means the integer in radix-nn notation denoted by the digits ddddd. More
precisely, one may write #, a non-empty sequence of decimal digits representing
an unsigned decimal integer n, r (or R), an optional sign, and a sequence of
radix-n digits, to indicate an integer written in radix n (which must be
between 2 and 36, inclusive). Only legal digits for the specified radix may be
used; for example, an octal number may contain only the digits 0 through 7. For
digits above 9, letters of the alphabet of either case may be used in order.
Binary, octal, and hexadecimal radices are useful enough to warrant the special
abbreviations #b for #2r, #o for #8r, and #x for #16r. For example:
#2r11010101 ;Another way of writing 213 decimal
#b11010101 ;Ditto
#b+11010101 ;Ditto
#o325 ;Ditto, in octal radix
#xD5 ;Ditto, in hexadecimal radix
#16r+D5 ;Ditto
#o-300 ;Decimal -192, written in base 8
#3r-21010 ;Same thing in base 3
#25R-7H ;Same thing in base 25
#xACCEDED ;181202413, in hexadecimal radix
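The same radices are available at run time through the standard functions
parse-integer and write-to-string, which this section does not otherwise
describe. A brief sketch, reusing values from the examples above:

```lisp
;; All of these notations denote the same integer.
(= #b11010101 #o325 #xD5 #16r+D5 213)        ; true

;; PARSE-INTEGER reads digits from a string in a given radix.
(parse-integer "D5" :radix 16)               ; => 213
(parse-integer "-7H" :radix 25)              ; => -192

;; WRITE-TO-STRING prints an integer in a given base.
(write-to-string 213 :base 2 :radix nil)     ; => "11010101"
(write-to-string -192 :base 8 :radix nil)    ; => "-300"
```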
2.1.2. Ratios
A ratio is a number representing the mathematical ratio of two integers.
Integers and ratios collectively constitute the type rational. The canonical
representation of a rational number is as an integer if its value is integral,
and otherwise as the ratio of two integers, the numerator and denominator,
whose greatest common divisor is 1, and of which the denominator is positive
(and in fact greater than 1, or else the value would be integral). A ratio is
notated with / as a separator, thus: 3/5. It is possible to notate ratios in
non-canonical (unreduced) forms, such as 4/6, but the Lisp function prin1
always prints the canonical form for a ratio.
If any computation produces a result that is a ratio of two integers such that
the denominator evenly divides the numerator, then the result is immediately
converted to the equivalent integer. This is called the rule of rational
canonicalization.
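The rule can be observed with ordinary arithmetic. The following sketch uses
arbitrary sample values:

```lisp
(/ 4 6)                      ; => 2/3, reduced to lowest terms
(/ 10 5)                     ; => 2, an integer rather than a ratio
(typep (/ 10 5) 'integer)    ; true
(typep (/ 4 6) 'ratio)       ; true
(numerator 4/6)              ; => 2
(denominator 4/6)            ; => 3
(denominator -17/23)         ; => 23, the denominator is always positive
(+ 1/3 2/3)                  ; => 1
```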
Rational numbers may be written as the possibly signed quotient of decimal
numerals: an optional sign followed by two non-empty sequences of digits
separated by a /. This syntax may be described as follows:
ratio ::= [sign] {digit}+ / {digit}+
The second sequence may not consist entirely of zeros. For example:
2/3 ;This is in canonical form
4/6 ;A non-canonical form for the same number
-17/23 ;A not very interesting ratio
-30517578125/32768 ;This is (-5/2)^15
10/5 ;The canonical form for this is 2
To notate rational numbers in radices other than ten, one uses the same radix
specifiers (one of #nnR, #O, #B, or #X) as for integers. For example:
#o-101/75 ;Octal notation for -65/61
#3r120/21 ;Ternary notation for 15/7
#Xbc/ad ;Hexadecimal notation for 188/173
#xFADED/FACADE ;Hexadecimal notation for 1027565/16435934
2.1.3. Floating-Point Numbers
Common Lisp allows an implementation to provide one or more kinds of
floating-point number, which collectively make up the type float. Now a
floating-point number is a (mathematical) rational number of the form
s * f * b^(e-p), where s is +1 or -1, the sign; b is an integer greater than 1,
the base or radix of the representation; p is a positive integer, the precision
(in base-b digits) of the floating-point number; f is a positive integer
between b^(p-1) and b^p - 1 (inclusive), the significand; and e is an integer,
the exponent. The value of p and the range of e depend on the implementation
and on the type of floating-point number within that implementation. In
addition, there is a
floating-point zero; depending on the implementation, there may also be a
``minus zero.'' If there is no minus zero, then 0.0 and -0.0 are both
interpreted as simply a floating-point zero.
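The decomposition of a floating-point number into sign, significand, radix,
and exponent can be inspected with the standard functions integer-decode-float
and float-radix. The sketch below uses a hypothetical helper,
float-as-rational, to rebuild the exact rational value of a float from those
parts:

```lisp
(defun float-as-rational (x)
  "Rebuild the exact rational value of the float X from its decoded parts."
  (multiple-value-bind (significand exponent sign) (integer-decode-float x)
    ;; FLOAT-RADIX plays the role of the base b. The exponent returned
    ;; by INTEGER-DECODE-FLOAT already scales an integer significand,
    ;; so it corresponds to the quantity e - p in the description above.
    (* sign significand (expt (float-radix x) exponent))))

(= (float-as-rational 0.15625) (rational 0.15625))   ; true
(float-as-rational -2.5)                             ; => -5/2
(float-digits 1.0)   ; the precision p; implementation-dependent
```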
-------------------------------------------------------------------------------
Implementation note: The form of the above description should not be construed
to require the internal representation to be in sign-magnitude form.
Two's-complement and other representations are also acceptable. Note that the
radix of the internal representation may be other than 2, as on the IBM 360 and
370, which use radix 16; see float-radix.
-------------------------------------------------------------------------------
Floating-point numbers may be provided in a variety of precisions and sizes,
depending on the implementation. High-quality floating-point software tends to
depend critically on the precise nature of the floating-point arithmetic and so
may not always be completely portable. As an aid in writing programs that are
moderately portable, however, certain definitions are made here:
* A short floating-point number (type short-float) is of the representation
of smallest fixed precision provided by an implementation.
* A long floating-point number (type long-float) is of the representation
of the largest fixed precision provided by an implementation.
* Intermediate between short and long formats are two others, arbitrarily
called single and double (types single-float and double-float).
The precise definition of these categories is implementation-dependent.
However, the rough intent is that short floating-point numbers be precise to at
least four decimal places (but also have a space-efficient representation);
single floating-point numbers, to at least seven decimal places; and double
floating-point numbers, to at least fourteen decimal places. It is suggested
that the precision (measured in bits, computed as p * log2(b)) and the exponent size
(also measured in bits, computed as the base-2 logarithm of 1 plus the maximum
exponent value) be at least as great as the values in table 2-1.
Floating-point numbers are written in either decimal fraction or computerized
scientific notation: an optional sign, then a non-empty sequence of digits with
an embedded decimal point, then an optional decimal exponent specification. If
there is no exponent specifier, then the decimal point is required, and there
must be digits after it. The exponent specifier consists of an exponent marker,
an optional sign, and a non-empty sequence of digits. For preciseness, here is
a modified-BNF description of floating-point notation.
floating-point-number ::= [sign] {digit}* decimal-point {digit}* [exponent]
| [sign] {digit}+ [decimal-point {digit}*] exponent
sign ::= + | -
decimal-point ::= .
digit ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
exponent ::= exponent-marker [sign] {digit}+
exponent-marker ::= e | s | f | d | l | E | S | F | D | L
If no exponent specifier is present, or if the exponent marker e (or E) is
used, then the precise format to be used is not specified. When such a
representation is read and converted to an internal floating-point data object,
the format specified by the variable *read-default-float-format* is used; the
initial value of this variable is single-float.
The letters s, f, d, and l (or their respective uppercase equivalents)
explicitly specify the use of short, single, double, and long format,
respectively.
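The effect of the exponent markers and of *read-default-float-format* can be
sketched as follows. The helper reads-as-p is hypothetical and exists only for
this illustration:

```lisp
;; Explicit markers select the format directly.
(typep 1.0s0 'short-float)     ; true
(typep 1.0f0 'single-float)    ; true
(typep 1.0d0 'double-float)    ; true
(typep 1.0l0 'long-float)      ; true

(defun reads-as-p (string default-format type)
  "Read STRING with the default float format bound to DEFAULT-FORMAT
and report whether the resulting object is of type TYPE."
  (let ((*read-default-float-format* default-format))
    (typep (read-from-string string) type)))

;; With marker E, or with no exponent at all, the default applies.
(reads-as-p "6.02E+23" 'single-float 'single-float)   ; true
(reads-as-p "6.02E+23" 'double-float 'double-float)   ; true
(reads-as-p "6.02"     'double-float 'double-float)   ; true

;; An explicit marker overrides the default.
(reads-as-p "6.02d+23" 'single-float 'double-float)   ; true
```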
Examples of floating-point numbers:
0.0 ;Floating-point zero in default format
0E0 ;Also floating-point zero in default format
-.0 ;This may be a zero or a minus zero,
; depending on the implementation
0. ;The integer zero, not a floating-point zero!
0.0s0 ;A floating-point zero in short format
0s0 ;Also a floating-point zero in short format
3.1415926535897932384d0 ;A double-format approximation to pi
6.02E+23 ;Avogadro's number, in default format
602E+21 ;Also Avogadro's number, in default format
3.010299957f-1 ;log10(2), in single format
-0.000000001s9 ;e^(pi*i) in short format, the hard way
[change_begin]
Notice of correction. The first edition unfortunately listed an incorrect value
(3.1010299957f-1) for the base-10 logarithm of 2.
[change_end]
The internal format used for an external representation depends only on the
exponent marker and not on the number of decimal digits in the external
representation.
While Common Lisp provides terminology and notation sufficient to accommodate
four distinct floating-point formats, not all implementations will have the
means to support that many distinct formats. An implementation is therefore
permitted to provide fewer than four distinct internal floating-point formats,
in which case at least one of them will be ``shared'' by more than one of the
external format names short, single, double, and long according to the
following rules:
* If one internal format is provided, then it is considered to be single,
but serves also as short, double, and long. The data types short-float,
single-float, double-float, and long-float are considered to be identical.
An expression such as (eql 1.0s0 1.0d0) will be true in such an
implementation because the two numbers 1.0s0 and 1.0d0 will be converted
into the same internal format and therefore be considered to have the same
data type, despite the differing external syntax. Similarly, (typep 1.0L0
'short-float) will be true in such an implementation. For output purposes
all floating-point numbers are assumed to be of single format and thus
will print using the exponent letter E or F.
* If two internal formats are provided, then either of two correspondences
may be used, depending on which is the more appropriate:
o One format is short; the other is single and serves also as double
and long. The data types single-float, double-float, and long-float
are considered to be identical, but short-float is distinct. An
expression such as (eql 1.0s0 1.0d0) will be false, but (eql 1.0f0
1.0d0) will be true. Similarly, (typep 1.0L0 'short-float) will be
false, but (typep 1.0L0 'single-float) will be true. For output
purposes all floating-point numbers are assumed to be of short or
single format.
o One format is single and serves also as short; the other is double
and serves also as long. The data types short-float and single-float
are considered to be identical, and the data types double-float and
long-float are considered to be identical. An expression such as (eql
1.0s0 1.0d0) will be false, as will (eql 1.0f0 1.0d0); but (eql 1.0d0
1.0L0) will be true. Similarly, (typep 1.0L0 'short-float) will be
false, but (typep 1.0L0 'double-float) will be true. For output
purposes all floating-point numbers are assumed to be of single or
double format.
* If three internal formats are provided, then either of two
correspondences may be used, depending on which is the more appropriate:
o One format is short; another format is single; and the third format
is double and serves also as long. Similar constraints apply.
o One format is single and serves also as short; another is double;
and the third format is long.
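Because eql distinguishes floating-point numbers of different formats, a
program can discover how many internal formats a given implementation
provides. The following is only a sketch; the function name
distinct-float-format-count is hypothetical:

```lisp
(defun distinct-float-format-count ()
  "Count how many of the four format names denote distinct internal formats."
  ;; EQL on floats is true only when both the format and the value
  ;; agree, so two of these literals are EQL exactly when the
  ;; implementation shares one internal format between them.
  (length (remove-duplicates (list 1.0s0 1.0f0 1.0d0 1.0l0) :test #'eql)))

(distinct-float-format-count)   ; between 1 and 4, implementation-dependent
(eql 1.0s0 1.0d0)               ; true only if short and double are shared
```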
-------------------------------------------------------------------------------
Implementation note: It is recommended that an implementation provide as many
distinct floating-point formats as feasible, using table 2-1 as a guideline.
Ideally, short-format floating-point numbers should have an ``immediate''
representation that does not require heap allocation; single-format
floating-point numbers should approximate IEEE proposed standard single-format
floating-point numbers; and double-format floating-point numbers should
approximate IEEE proposed standard double-format floating-point numbers
[23,17,16].
-------------------------------------------------------------------------------
2.1.4. Complex Numbers
Complex numbers (type complex) are represented in Cartesian form, with a real
part and an imaginary part, each of which is a non-complex number (integer,
ratio, or floating-point number). It should be emphasized that the parts of a
complex number are not necessarily floating-point numbers; in this, Common Lisp
is like PL/I and differs from Fortran. However, both parts must be of the same
type: either both are rational, or both are of the same floating-point format.
Complex numbers may be notated by writing the characters #C followed by a list
of the real and imaginary parts. If the two parts as notated are not of the
same type, then they are converted according to the rules of floating-point
contagion as described in chapter 12. (Indeed, #C(a b) is equivalent to
#,(complex a b); see the description of the function complex.) For example:
#C(3.0s1 2.0s-1) ;Real and imaginary parts are short format
#C(5 -3) ;A Gaussian integer
#C(5/3 7.0) ;Will be converted internally to #C(1.66666 7.0)
#C(0 1) ;The imaginary unit, that is, i
The type of a specific complex number is indicated by a list of the word
complex and the type of the components; for example, a specialized
representation for complex numbers with short floating-point parts would be of
type (complex short-float). The type complex encompasses all complex
representations.
A complex number of type (complex rational), that is, one whose components are
rational, can never have a zero imaginary part. If the result of a computation
would be a complex rational with a zero imaginary part, the result is
immediately converted to a non-complex rational number by taking the real part.
This is called the rule of complex canonicalization. This rule does not apply
to floating-point complex numbers; #C(5.0 0.0) and 5.0 are different.
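Complex canonicalization and the conversion of mixed parts can be observed
directly. A brief sketch with arbitrary sample values:

```lisp
;; A complex rational with a zero imaginary part collapses to a rational.
(complex 5 0)                                  ; => 5
(typep (complex 5 0) 'integer)                 ; true
(* #C(0 1) #C(0 1))                            ; => -1

;; The rule does not apply to floating-point parts.
(typep (complex 5.0 0.0) 'complex)             ; true

;; Mixed parts are converted to a common floating-point format.
(typep (realpart (complex 5/3 7.0)) 'float)    ; true

(realpart #C(5 -3))                            ; => 5
(imagpart #C(5 -3))                            ; => -3
```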
2.2. Characters
Characters are represented as data objects of type character.
[old_change_begin]
There are two subtypes of interest, called standard-char and string-char.
[old_change_end]
[change_begin]
X3J13 voted in March 1989 (CHARACTER-PROPOSAL) to remove the type
string-char.
[change_end]
A character object can be notated by writing #\ followed by the character
itself. For example, #\g means the character object for a lowercase g. This
works well enough for printing characters. Non-printing characters have names,
and can be notated by writing #\ and then the name; for example, #\Space (or
#\SPACE or #\space or #\sPaCE) means the space character. The syntax for
character names after #\ is the same as that for symbols. However, only
character names that are known to the particular implementation may be used.
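The character notation can be exercised with the standard functions char=,
char-name, and name-char, as in this brief sketch:

```lisp
(typep #\g 'character)                 ; true

;; Character names are recognized without regard to case.
(char= #\Space #\SPACE #\sPaCE)        ; true
(char= #\Space (char " " 0))           ; true
(char-name #\Space)                    ; => "Space"
(name-char "SPACE")                    ; => #\Space

;; For a printing character written directly, case matters.
(char= #\g #\G)                        ; false
```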
-------------------------------------------------------------------------------
* Standard Characters
* Line Divisions
* Non-standard Characters
* Character Attributes
* String Characters
-------------------------------------------------------------------------------
2.2.1. Standard Characters
Common Lisp defines a standard character set (subtype standard-char) for two
purposes. Common Lisp programs that are written in the standard character set
can be read by any Common Lisp implementation; and Common Lisp programs that
use only standard characters as data objects are most likely to be portable.
The Common Lisp character set consists of a space character #\Space, a newline
character #\Newline, and the following ninety-four non-blank printing
characters or their equivalents:
! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ?
@ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _
` a b c d e f g h i j k l m n o p q r s t u v w x y z { | } ~
The Common Lisp standard character set is apparently equivalent to the
ninety-five standard ASCII printing characters plus a newline character.
Nevertheless, Common Lisp is designed to be relatively independent of the ASCII
character encoding. For example, the collating sequence is not specified except
to say that digits must be properly ordered, the uppercase letters must be
properly ordered, and the lowercase letters must be properly ordered (see char<
for a precise specification). Other character encodings, particularly EBCDIC,
should be easily accommodated (with a suitable mapping of printing characters).
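The ordering guarantees mentioned above can be checked in a portable way. This
is a sketch only; it relies on nothing beyond the standard functions
standard-char-p and char<:

```lisp
(standard-char-p #\a)          ; true
(standard-char-p #\Space)      ; true
(standard-char-p #\Newline)    ; true

;; Digits, uppercase letters, and lowercase letters are each
;; properly ordered among themselves.
(apply #'char< (coerce "0123456789" 'list))                   ; true
(apply #'char< (coerce "ABCDEFGHIJKLMNOPQRSTUVWXYZ" 'list))   ; true
(apply #'char< (coerce "abcdefghijklmnopqrstuvwxyz" 'list))   ; true

;; The relative order of, say, #\Z and #\a is not specified, so
;; portable code should not depend on the value of (char< #\Z #\a).
```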
Of the ninety-four non-blank printing characters, the following are used in
only limited ways in the syntax of Common Lisp programs:
[ ] { } ? ! ^ _ ~ $ %
[old_change_begin]
All of these characters except ! and _ are used within format strings as
formatting directives. Except for this, [, ], {, }, ?, and ! are not used in
Common Lisp and are reserved to the user for syntactic extensions; ^ and _ are
not yet used in Common Lisp but are part of the syntax of reserved tokens and
are reserved to implementors; ~ is not yet used in Common Lisp and is reserved
to implementors; and $ and % are normally regarded as alphabetic characters but
are not used in the names of any standard Common Lisp functions, variables, or
other entities.
[old_change_end]
[change_begin]
X3J13 voted in June 1989 (PRETTY-PRINT-INTERFACE) to add a format directive
~_ (see chapter 27).
[change_end]
The following characters are called semi-standard:
#\Backspace #\Tab #\Linefeed #\Page #\Return #\Rubout
Not all implementations of Common Lisp need to support them; but those
implementations that use the standard ASCII character set should support them,
treating them as corresponding respectively to the ASCII characters BS (octal
code 010), HT (011), LF (012), FF (014), CR (015), and DEL (177). These
characters are not members of the subtype standard-char unless synonymous with
one of the standard characters specified above. For example, in a given
implementation it might be sensible for the implementor to define #\Linefeed or
#\Return to be synonymous with #\Newline, or #\Tab to be synonymous with
#\Space.
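As a rough illustration, a program can ask which of the semi-standard names a
given implementation recognizes. The helper name below is hypothetical; the
sketch relies only on name-char, which returns nil for a name the
implementation does not support.

```lisp
;; NAME-CHAR returns the character with the given name, or NIL if the
;; implementation does not recognize that name.
(defun supported-semi-standard-names ()
  (remove-if-not #'name-char
                 '("Backspace" "Tab" "Linefeed" "Page" "Return" "Rubout")))

(supported-semi-standard-names)     ;The result is implementation-dependent

;; Where #\Tab is supported, report its code and whether it happens to be
;; a standard character in this implementation.
(let ((tab (name-char "Tab")))
  (when tab
    (list (char-code tab) (typep tab 'standard-char))))
```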
2.2.2. Line Divisions
The treatment of line divisions is one of the most difficult issues in
designing portable software, simply because there is so little agreement among
operating systems. Some use a single character to delimit lines; the
recommended ASCII character for this purpose is the line feed character LF
(also called the new line character, NL), but some systems use the carriage
return character CR. Much more common is the two-character sequence CR followed
by LF. Frequently line divisions have no representation as a character but are
implicit in the structuring of a file into records, each record containing a
line of text. A deck of punched cards has this structure, for example.
Common Lisp provides an abstract interface by requiring that there be a single
character, #\Newline, that within the language serves as a line delimiter. (The
language C has a similar requirement.) An implementation of Common Lisp must
translate between this internal single-character representation and whatever
external representation(s) may be used.
-------------------------------------------------------------------------------
Implementation note: How the character called #\Newline is represented
internally is not specified here, but it is strongly suggested that the ASCII
LF character be used in Common Lisp implementations that use the ASCII
character encoding. The ASCII CR character is a workable, but in most cases
inferior, alternative.
-------------------------------------------------------------------------------
[change_begin]
When the first edition was written it was not yet clear that UNIX would become
so widely accepted. The decision to represent the line delimiter as a single
character has proved to be a good one.
[change_end]
The requirement that a line division be represented as a single character has
certain consequences. A character string written in the middle of a program in
such a way as to span more than one line must contain exactly one character to
represent each line division. Consider this code fragment:
(setq a-string "This string
contains
forty-two characters.")
Between g and c there must be exactly one character, #\Newline; a two-character
sequence, such as #\Return and then #\Newline, is not acceptable, nor is the
absence of a character. The same is true between s and f.
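The count can be checked directly. This sketch assumes the source text is read
by a conforming reader, so that each line division inside the string literal
becomes a single #\Newline; the variable name is chosen only for illustration.

```lisp
(defparameter *a-string* "This string
contains
forty-two characters.")

(length *a-string*)                 ; => 42
(char *a-string* 11)                ; => #\Newline (between g and c)
(char *a-string* 20)                ; => #\Newline (between s and f)
(count #\Newline *a-string*)        ; => 2
```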
When the character #\Newline is written to an output file, the Common Lisp
implementation must take the appropriate action to produce a line division.
This might involve writing out a record or translating #\Newline to a CR/LF
sequence.
-------------------------------------------------------------------------------
Implementation note: If an implementation uses the ASCII character encoding,
uses the CR/LF sequence externally to delimit lines, uses LF to represent
#\Newline internally, and supports #\Return as a data object corresponding to
the ASCII character CR, the question arises as to what action to take when the
program writes out #\Return followed by #\Newline. It should first be noted
that #\Return is not a standard Common Lisp character, and the action to be
taken when #\Return is written out is therefore not defined by the Common Lisp
language. A plausible approach is to buffer the #\Return character and suppress
it if and only if the next character is #\Newline (the net effect is to
generate a CR/LF sequence). Another plausible approach is simply to ignore the
difficulty and declare that writing #\Return and then #\Newline results in the
sequence CR/CR/LF in the output.
-------------------------------------------------------------------------------
2.2.3. Non-standard Characters
Any implementation may provide additional characters, whether printing
characters or named characters. Some plausible examples:
#\π #\α #\Break #\Home-Up #\Escape
The use of such characters may render Common Lisp programs non-portable.
[old_change_begin]
2.2.4. Character Attributes
Every object of type character has three attributes: code, bits, and font. The
code attribute is intended to distinguish among the printed glyphs and
formatting functions for characters; it is a numerical encoding of the
character proper. The bits attribute allows extra flags to be associated with a
character. The font attribute permits a specification of the style of the
glyphs (such as italics). Each of these attributes may be understood to be a
non-negative integer.
The font attribute may be notated in unsigned decimal notation between the #
and the \. For example, #3\a means the letter a in font 3. This might mean the
same thing as #\α if font 3 were used to represent Greek letters. Note that
not all Common Lisp implementations provide for non-zero font attributes; see
char-font-limit.
The bits attribute may be notated by preceding the name of the character by the
names or initials of the bits, separated by hyphens. The character itself may
be written instead of the name, preceded if necessary by \. For example:
#\Control-Meta-Return #\Meta-Control-Q
#\Hyper-Space #\Meta-\a
#\Control-A #\Meta-Hyper-\:
#\C-M-Return #\Hyper-\π
Note that not all Common Lisp implementations provide for non-zero bits
attributes; see char-bits-limit.
[old_change_end]
[change_begin]
X3J13 voted in March 1989 (CHARACTER-PROPOSAL) to replace the notion of bits
and font attributes with that of implementation-defined attributes.
[change_end]
[old_change_begin]
2.2.5. String Characters
Any character whose bits and font attributes are zero may be contained in
strings. All such characters together constitute a subtype of the characters;
this subtype is called string-char.
[old_change_end]
[change_begin]
X3J13 voted in March 1989 (CHARACTER-PROPOSAL) to eliminate the type
string-char. Two new subtypes of character are base-character, defined to be
equivalent to the result of the function call
(upgraded-array-element-type 'standard-char)
and extended-character, defined to be equivalent to the type specifier
(and character (not base-character))
An implementation may support additional subtypes of character that may or may
not be supertypes of base-character. In addition, an implementation may define
base-character to be equivalent to character. The choice of any base characters
that are not standard characters is implementation-defined. Only base
characters can be elements of a base string. No upper bound is specified for
the number of distinct characters of type base-character (that is
implementation-dependent), but the lower bound is 96, the number of standard
Common Lisp characters.
[change_end]
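The following sketch probes these relationships. It assumes an implementation
of the later ANSI standard, in which these types are spelled base-char and
extended-char rather than base-character and extended-character.

```lisp
(typep #\a 'standard-char)                     ; => T
(typep #\a 'base-char)                         ; => T
(subtypep 'standard-char 'base-char)           ; => T, T
(subtypep 'base-char 'character)               ; => T, T

;; The actual representation type chosen for base characters is
;; implementation-dependent.
(upgraded-array-element-type 'standard-char)
```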
2.3. Symbols
Symbols are Lisp data objects that serve several purposes and have several
interesting characteristics. Every object of type symbol has a name, called its
print name. Given a symbol, one can obtain its name in the form of a string.
Conversely, given the name of a symbol as a string, one can obtain the symbol
itself. (More precisely, symbols are organized into packages, and all the
symbols in a package are uniquely identified by name. See chapter 11.)
Symbols have a component called the property list, or plist. By convention this
is always a list whose even-numbered components (calling the first component
zero) are symbols, here functioning as property names, and whose odd-numbered
components are associated property values. Functions are provided for
manipulating this property list; in effect, these allow a symbol to be treated
as an extensible record structure.
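A brief sketch of these facilities follows. The symbol clyde and its
properties are hypothetical, chosen only for illustration; the order of
entries on a property list should not be relied upon.

```lisp
(symbol-name 'frobboz)                  ; => "FROBBOZ"
(eq (intern "FROBBOZ") 'frobboz)        ; => T, within a single package

;; Treat the symbol CLYDE as a small extensible record.
(setf (get 'clyde 'species) 'elephant)
(setf (get 'clyde 'color) 'grey)

(get 'clyde 'species)                   ; => ELEPHANT
(get 'clyde 'weight)                    ; => NIL (no such property)
(symbol-plist 'clyde)                   ;A list of alternating names and values
```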
Symbols are also used to represent certain kinds of variables in Lisp programs,
and there are functions for dealing with the values associated with symbols in
this role.
A symbol can be notated simply by writing its name. If its name is not empty,
and if the name consists only of uppercase alphabetic, numeric, or certain
pseudo-alphabetic special characters (but not delimiter characters such as
parentheses or space), and if the name of the symbol cannot be mistaken for a
number, then the symbol can be notated by the sequence of characters in its
name. Any uppercase letters that appear in the (internal) name may be written
in either case in the external notation (more on this below). For example:
FROBBOZ ;The symbol whose name is FROBBOZ
frobboz ;Another way to notate the same symbol
fRObBoz ;Yet another way to notate it
unwind-protect ;A symbol with a - in its name
+$ ;The symbol named +$
1+ ;The symbol named 1+
+1 ;This is the integer 1, not a symbol
pascal_style ;This symbol has an underscore in its name
b^2-4*a*c ;This is a single symbol!
; It has several special characters in its name
file.rel.43 ;This symbol has periods in its name
/usr/games/zork ;This symbol has slashes in its name
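These notations can be checked at the top level. The sketch assumes the
standard readtable, in which the reader converts lowercase letters in symbol
names to uppercase.

```lisp
(eq 'frobboz 'FROBBOZ)              ; => T
(eq 'fRObBoz 'frobboz)              ; => T
(symbol-name 'unwind-protect)       ; => "UNWIND-PROTECT"
(symbolp '1+)                       ; => T
(integerp '+1)                      ; => T (the integer 1, not a symbol)
(symbol-name 'b^2-4*a*c)            ; => "B^2-4*A*C"
(symbolp 'file.rel.43)              ; => T
(symbolp '/usr/games/zork)          ; => T
```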
In addition to letters and numbers, the following characters are normally
considered to be alphabetic for the purposes of notating symbols:
+ - * / @ $ % ^ & _ = < > ~ .
Some of these characters have conventional purposes for naming things; for
example, symbols that name special variables generally have names beginning and
ending with *. The last character listed above, the period, is considered
alphabetic provided that a token does not consist entirely of periods. A single
period standing by itself is used in the notation of conses and dotted lists; a
token consisting of two or more periods is syntactically illegal. (The period
also serves as the decimal point in the notation of numbers.)
The following characters are also alphabetic by default but are explicitly
reserved to the user for definition as reader macro characters (see section
22.1.3) or any other desired purpose and therefore should not be used routinely
in names of symbols:
? ! [ ] { }
A symbol may have uppercase letters, lowercase letters, or both in its print
name. However, the Lisp reader normally converts lowercase letters to the
corresponding uppercase letters when reading symbols. The net effect is that
most of the time case makes no difference when notating symbols. Case does make
a difference internally and when printing a symbol. Internally the symbols that
name all standard Common Lisp functions, variables, and keywords have uppercase
names; their names appear in lowercase in this book for readability. Typing
such names with lowercase letters works because the function read will convert
lowercase letters to the equivalent uppercase letters.
[change_begin]
X3J13 voted in June 1989 (READ-CASE-SENSITIVITY) to introduce readtable-case,
which controls whether read will alter the case of letters read as part of the
name of a symbol.
[change_end]
If a symbol cannot be simply notated by the characters of its name because the
(internal) name contains special characters or lowercase letters, then there
are two ``escape'' conventions for notating them. Writing a \ character before
any character causes the character to be treated itself as an ordinary
character for use in a symbol name; in particular, it suppresses internal
conversion of lowercase letters to their uppercase equivalents. If any
character in a notation is preceded by \, then that notation can never be
interpreted as a number. For example:
\( ;The symbol whose name is (
\+1 ;The symbol whose name is +1
+\1 ;Also the symbol whose name is +1
\frobboz ;The symbol whose name is fROBBOZ
3.14159265\s0 ;The symbol whose name is 3.14159265s0
3.14159265\S0 ;A different symbol, whose name is 3.14159265S0
3.14159265s0 ;A short-format floating-point approximation to pi
APL\\360 ;The symbol whose name is APL\360
apl\\360 ;Also the symbol whose name is APL\360
\(b^2\)\ -\ 4*a*c ;The name is (B^2) - 4*A*C;
; it has parentheses and two spaces in it
\(\b^2\)\ -\ 4*\a*\c ;The name is (b^2) - 4*a*c;
; the letters are explicitly lowercase
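Several of these cases can be confirmed with symbol-name. The sketch assumes
the standard readtable; note that inside a Lisp string a \ must itself be
written as \\.

```lisp
(symbol-name '\+1)                      ; => "+1"
(eq '\+1 '+\1)                          ; => T
(symbol-name '\frobboz)                 ; => "fROBBOZ"
(symbolp '3.14159265\s0)                ; => T (the \ prevents a numeric reading)
(floatp '3.14159265s0)                  ; => T
(eq 'APL\\360 'apl\\360)                ; => T
(length (symbol-name 'APL\\360))        ; => 7
(symbol-name '\(b^2\)\ -\ 4*a*c)        ; => "(B^2) - 4*A*C"
```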
It may be tedious to insert a \ before every delimiter character in the name of
a symbol if there are many of them. An alternative convention is to surround
the name of a symbol with vertical bars; these cause every character between
them to be taken as part of the symbol's name, as if \ had been written before
each one, excepting only | itself and \, which must nevertheless be preceded by
\. For example:
|"| ;The same as writing \"
|(b^2) - 4*a*c| ;The name is (b^2) - 4*a*c
|frobboz| ;The name is frobboz, not FROBBOZ
|APL\360| ;The name is APL360, because the \ quotes the 3
|APL\\360| ;The name is APL\360
|apl\\360| ;The name is apl\360
|\|\|| ;Same as \|\|: the name is ||
|(B^2) - 4*A*C| ;The name is (B^2) - 4*A*C;
; it has parentheses and two spaces in it
|(b^2) - 4*a*c| ;The name is (b^2) - 4*a*c
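The equivalence of the two escape conventions can be sketched as follows,
again assuming the standard readtable.

```lisp
(symbol-name '|frobboz|)                ; => "frobboz"
(eq '|FROBBOZ| 'frobboz)                ; => T
(symbol-name '|APL\360|)                ; => "APL360"
(length (symbol-name '|APL\\360|))      ; => 7 (the name contains one \)
(length (symbol-name '|\|\||))          ; => 2

;; Vertical bars and per-character \ escapes denote the same symbol.
(eq '|(b^2) - 4*a*c| '\(\b^2\)\ -\ 4*\a*\c)    ; => T
```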
2.4. Lists and Conses
A cons is a record structure containing two components called the car and the
cdr. Conses are used primarily to represent lists.
A list is recursively defined to be either the empty list or a cons whose cdr
component is a list. A list is therefore a chain of conses linked by their cdr
components and terminated by nil, the empty list. The car components of the
conses are called the elements of the list. For each element of the list there
is a cons. The empty list has no elements at all.
A list is notated by writing the elements of the list in order, separated by
blank space (space, tab, or return characters) and surrounded by parentheses.
(a b c) ;A list of three symbols
(2.0s0 (a 1) #\*) ;A list of three things: a short floating-point
; number, another list, and a character object
The empty list nil therefore can be written as (), because it is a list with no
elements.
A dotted list is one whose last cons does not have nil for its cdr, rather some
other data object (which is also not a cons, or the first-mentioned cons would
not be the last cons of the list). Such a list is called ``dotted'' because of
the special notation used for it: the elements of the list are written between
parentheses as before, but after the last element and before the right
parenthesis are written a dot (surrounded by blank space) and then the cdr of
the last cons. As a special case, a single cons is notated by writing the car
and the cdr between parentheses and separated by a space-surrounded dot. For
example:
(a . 4) ;A cons whose car is a symbol
; and whose cdr is an integer
(a b c . d) ;A dotted list with three elements whose last cons
; has the symbol d in its cdr
-------------------------------------------------------------------------------
Compatibility note: In MacLisp, the dot in dotted-list notation need not be
surrounded by white space or other delimiters. The dot is required to be
delimited in Common Lisp, as in Lisp Machine Lisp.
-------------------------------------------------------------------------------
It is legitimate to write something like (a b . (c d)); this means the same as
(a b c d). The standard Lisp output routines will never print a list in the
first form, however; they will avoid dot notation wherever possible.
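A short sketch of the relationship between conses and dot notation; the
printed forms shown assume the default printer settings.

```lisp
(cons 'a 4)                             ;Prints as (A . 4)
(cons 'a (cons 'b nil))                 ;Prints as (A B)
(equal '(a b . (c d)) '(a b c d))       ; => T
(cdr (last '(a b c . d)))               ; => D
(prin1-to-string '(a b . (c d)))        ; => "(A B C D)"
(eq '() nil)                            ; => T
```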
Often the term list is used to refer either to true lists or to dotted lists.
When the distinction is important, the term ``true list'' will be used to refer
to a list terminated by nil. Most functions advertised to operate on lists
expect to be given true lists. Throughout this book, unless otherwise
specified, it is an error to pass a dotted list to a function that is specified
to require a list as an argument.
-------------------------------------------------------------------------------
Implementation note: Implementors are encouraged to use the equivalent of the
predicate endp wherever it is necessary to test for the end of a list. Whenever
feasible, this test should explicitly signal an error if a list is found to be
terminated by a non-nil atom. However, such an explicit error signal is not
required, because some such tests occur in important loops where efficiency is
important. In such cases, the predicate atom may be used to test for the end of
the list, quietly treating any non-nil list-terminating atom as if it were nil.
-------------------------------------------------------------------------------
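The two termination tests can be contrasted in a minimal sketch. Both function
names are hypothetical. Applying the first to a dotted list is an error, which
an implementation may or may not signal.

```lisp
;; Uses ENDP, so a dotted list is treated as an error.
(defun careful-length (list)
  (do ((tail list (cdr tail))
       (n 0 (1+ n)))
      ((endp tail) n)))

;; Uses ATOM, quietly treating any terminating atom as if it were NIL.
(defun lenient-length (list)
  (do ((tail list (cdr tail))
       (n 0 (1+ n)))
      ((atom tail) n)))

(careful-length '(a b c))               ; => 3
(lenient-length '(a b c))               ; => 3
(lenient-length '(a b c . d))           ; => 3
```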
Sometimes the term tree is used to refer to some cons and all the other conses
transitively accessible to it through car and cdr links until non-conses are
reached; these non-conses are called the leaves of the tree.
Lists, dotted lists, and trees are not mutually exclusive data types; they are
simply useful points of view about structures of conses. There are yet other
terms, such as association list. None of these are true Lisp data types. Conses
are a data type, and nil is the sole object of type null. The Lisp data type
list is taken to mean the union of the cons and null data types, and therefore
encompasses both true lists and dotted lists.
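These relationships can be observed with typep:

```lisp
(typep '(a b c) 'list)                  ; => T
(typep '(a b . c) 'list)                ; => T (a dotted list is still of type list)
(typep '(a . b) 'cons)                  ; => T
(typep '() 'list)                       ; => T
(typep '() 'null)                       ; => T
(typep '() 'cons)                       ; => NIL
```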
2.5. Arrays
An array is an object with components arranged according to a Cartesian
coordinate system. In general, these components may be any Lisp data objects.
The number of dimensions of an array is called its rank (this terminology is
borrowed from APL); the rank is a non-negative integer. Likewise, each
dimension is itself a non-negative integer. The total number of elements in the
array is the product of all the dimensions.
An implementation of Common Lisp may impose a limit on the rank of an array,
but this limit may not be smaller than 7. Therefore, any Common Lisp program
may assume the use of arrays of rank 7 or less. (A program may determine the
actual limit on array ranks for a given implementation by examining the
constant array-rank-limit.)
It is permissible for a dimension to be zero. In this case, the array has no
elements, and any attempt to access an element is in error. However, other
properties of the array, such as the dimensions themselves, may be used. If the
rank is zero, then there are no dimensions, and the product of the dimensions
is then by definition 1. A zero-rank array therefore has a single element.
An array element is specified by a sequence of indices. The length of the
sequence must equal the rank of the array. Each index must be a non-negative
integer strictly less than the corresponding array dimension. Array indexing is
therefore zero-origin, not one-origin as in (the default case of) Fortran.
As an example, suppose that the variable foo names a 3-by-5 array. Then the
first index may be 0, 1, or 2, and the second index may be 0, 1, 2, 3, or 4.
One may refer to array elements using the function aref; for example, (aref foo
2 1) refers to element (2, 1) of the array. Note that aref takes a variable
number of arguments: an array, and as many indices as the array has dimensions.
A zero-rank array has no dimensions, and therefore aref would take such an
array and no indices, and return the sole element of the array.
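The following sketch exercises these rules. The variable names are
illustrative; make-array and the array inquiry functions are described in
chapter 17.

```lisp
(defparameter *foo* (make-array '(3 5) :initial-element 0))

(setf (aref *foo* 2 1) 'x)
(aref *foo* 2 1)                        ; => X
(array-rank *foo*)                      ; => 2
(array-dimensions *foo*)                ; => (3 5)
(array-total-size *foo*)                ; => 15

;; A zero-rank array has exactly one element, accessed with no indices.
(defparameter *scalar* (make-array '() :initial-element 42))
(array-rank *scalar*)                   ; => 0
(aref *scalar*)                         ; => 42

;; A zero dimension yields an array with no elements.
(array-total-size (make-array '(3 0)))  ; => 0

(> array-rank-limit 7)                  ; => T in every implementation
```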
In general, arrays can be multidimensional, can share their contents with other
array objects, and can have their size altered dynamically (either enlarging or
shrinking) after creation. A one-dimensional array may also have a fill
pointer.
Multidimensional arrays store their components in row-major order; that is,
internally a multidimensional array is stored as a one-dimensional array, with
the multidimensional index sets ordered lexicographically, last index varying
fastest. This is important in two situations: (1) when arrays with different
dimensions share their contents, and (2) when accessing very large arrays in a
virtual-memory implementation. (The first situation is a matter of semantics;
the second, a matter of efficiency.)
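Row-major order can be made visible through a displaced array, as in this
sketch (the variable names are illustrative). In a 3-by-5 array, element
(2, 1) lies at row-major position 2*5 + 1 = 11.

```lisp
(defparameter *grid* (make-array '(3 5)))

;; Store each element's own index pair in it.
(dotimes (i 3)
  (dotimes (j 5)
    (setf (aref *grid* i j) (list i j))))

(array-row-major-index *grid* 2 1)      ; => 11
(row-major-aref *grid* 11)              ; => (2 1)

;; A vector sharing the contents of *GRID* sees them in row-major order.
(defparameter *flat* (make-array 15 :displaced-to *grid*))
(aref *flat* 11)                        ; => (2 1)
(aref *flat* 4)                         ; => (0 4)
```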
An array that is not displaced to another array, has no fill pointer, and is
not to have its size adjusted dynamically after creation is called a simple
array. The user may provide declarations that certain arrays will be simple.
Some implementations can handle simple arrays in an especially efficient
manner; for example, simple arrays may have a more compact representation than
non-simple arrays.
[change_begin]
X3J13 voted in June 1989 (ADJUST-ARRAY-NOT-ADJUSTABLE) to clarify that if one
or more of the :adjustable, :fill-pointer, and :displaced-to arguments is true
when make-array is called, then whether the resulting array is simple is
unspecified; but if all three arguments are false, then the resulting array is
guaranteed to be simple.
[change_end]
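A sketch of the distinction, using a hypothetical *stack* variable: an array
created with no special options is guaranteed to be simple, while an
adjustable vector with a fill pointer can grow on demand.

```lisp
(typep (make-array 5) 'simple-array)    ; => T
(typep (make-array 5) 'simple-vector)   ; => T

(defparameter *stack*
  (make-array 2 :adjustable t :fill-pointer 0))

(vector-push-extend 'a *stack*)
(vector-push-extend 'b *stack*)
(vector-push-extend 'c *stack*)         ;Enlarges the array if necessary

(length *stack*)                        ; => 3
(fill-pointer *stack*)                  ; => 3
(aref *stack* 2)                        ; => C
(adjustable-array-p *stack*)            ; => T
```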
2.5.1. Vectors
One-dimensional arrays are called vectors in Common Lisp and constitute the
type vector (which is therefore a subtype of array). Vectors and lists are
collectively considered to be sequences. They differ in that any component of a
one-dimensional array can be accessed in constant time, whereas the average
component access time for a list is linear in the length of the list; on the
other hand, adding a new element to the front of a list takes constant time,
whereas the same operation on an array takes time linear in the length of the
array.
A general vector (a one-dimensional array that can have any data object as an
element but that has no additional paraphernalia) can be notated by notating
the components in order, separated by whitespace and surrounded by #( and ).
For example:
#(a b c) ;A vector of length 3
#() ;An empty vector
#(2 3 5 7 11 13 17 19 23 29 31 37 41 43 47)
;A vector containing the primes below 50
Note that when the function read parses this syntax, it always constructs a
simple general vector.
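For instance (assuming, as X3J13 specified, that vectors are self-evaluating):

```lisp
(length #(a b c))                       ; => 3
(length #())                            ; => 0
(aref #(2 3 5 7 11) 4)                  ; => 11
(typep #(a b c) 'simple-vector)         ; => T
(typep #(a b c) 'sequence)              ; => T

;; Sequence functions such as ELT apply to vectors and lists alike.
(elt #(a b c) 1)                        ; => B
(elt '(a b c) 1)                        ; => B
```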
-------------------------------------------------------------------------------
Rationale: Many people have suggested that brackets be used to notate vectors,
as [a b c] instead of #(a b c). This notation would be shorter, perhaps more
readable, and certainly in accord with cultural conventions in other parts of
computer science and mathematics. However, to preserve the usefulness of the
user-definable macro-character feature of the function read, it is necessary to
leave some characters to the user for this purpose. Experience in MacLisp has
shown that users, especially implementors of languages for use in artificial
intelligence research, often want to define special kinds of brackets.
Therefore Common Lisp avoids using brackets and braces for any syntactic
purpose.
-------------------------------------------------------------------------------
Implementations may provide certain specialized representations of arrays for
efficiency in the case where all the components are of the same specialized
(typically numeric) type. All implementations provide specialized arrays for
the cases when the components are characters (or rather, a special subset of
the characters); the one-dimensional instances of this specialization are
called strings. All implementations are also required to provide specialized
arrays of bits, that is, arrays of type (array bit); the one-dimensional
instances of this specialization are called bit-vectors.
2.5.2. Strings
[old_change_begin]
A string is simply a vector of characters. More precisely, a string is a
specialized vector whose elements are of type string-char.
[old_change_end]
[change_begin]
X3J13 voted in March 1989 (CHARACTER-PROPOSAL) to eliminate the type
string-char and to redefine the type string to be the union of one or more
specialized vector types, the types of whose elements are subtypes of the type
character. Subtypes of string include simple-string, base-string, and
simple-base-string.
base-string == (vector base-character)
simple-base-string == (simple-array base-character (*))
An implementation may support other string subtypes as well. All Common Lisp
functions that operate on strings treat all strings uniformly; note, however,
that it is an error to attempt to insert an extended character into a base
string.
[change_end]
The type string is therefore a subtype of the type vector.
A string can be written as the sequence of characters contained in the string,
preceded and followed by a " (double quote) character. Any " or \ character in
the sequence must additionally have a \ character before it.
For example:
"Foo" ;A string with three characters in it
"" ;An empty string
"\"APL\\360?\" he cried." ;A string with twenty characters
"|x| = |-x|" ;A ten-character string
Notice that any vertical bar | in a string need not be preceded by a \.
Similarly, any double quote in the name of a symbol written using vertical-bar
notation need not be preceded by a \. The double-quote and vertical-bar
notations are similar but distinct: double quotes indicate a character string
containing the sequence of characters, whereas vertical bars indicate a symbol
whose name is the contained sequence of characters.
The characters contained by the double quotes, taken from left to right, occupy
locations within the string with increasing indices. The leftmost character is
string element number 0, the next one is element number 1, the next one is
element number 2, and so on.
Note that the function prin1 will print any character vector (not just a simple
one) using this syntax, but the function read will always construct a simple
string when it reads this syntax.
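The character counts and indexing described above can be verified directly:

```lisp
(length "Foo")                          ; => 3
(length "")                             ; => 0
(length "\"APL\\360?\" he cried.")      ; => 20
(length "|x| = |-x|")                   ; => 10
(char "Foo" 0)                          ; => #\F
(char "Foo" 2)                          ; => #\o
(typep "Foo" 'simple-string)            ; => T
(typep "Foo" 'vector)                   ; => T

;; A double quote needs no \ inside vertical bars, but does inside a string.
(string= (symbol-name '|"|) "\"")       ; => T
```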
2.5.3. Bit-Vectors
A bit-vector can be written as the sequence of bits contained in the bit-vector,
preceded by #*; any delimiter character, such as whitespace, will terminate the
bit-vector syntax. For example:
#*10110 ;A five-bit bit-vector; bit 0 is a 1
#* ;An empty bit-vector
The bits notated following the #*, taken from left to right, occupy locations
within the bit-vector with increasing indices. The leftmost notated bit is
bit-vector element number 0, the next one is element number 1, and so on.
The function prin1 will print any bit-vector (not just a simple one) using this
syntax, but the function read will always construct a simple bit-vector when it
reads this syntax.
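A few checks of this notation follow; bit and bit-and are described in
chapter 17.

```lisp
(length #*10110)                        ; => 5
(bit #*10110 0)                         ; => 1
(bit #*10110 1)                         ; => 0
(length #*)                             ; => 0
(typep #*10110 'simple-bit-vector)      ; => T
(bit-and #*1100 #*1010)                 ; => #*1000
```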
2.6. Hash Tables
Hash tables provide an efficient way of mapping any Lisp object (a key) to an
associated object. They are provided as primitives of Common Lisp because some
implementations may need to use internal storage management strategies that
would make it very difficult for the user to implement hash tables in a
portable fashion. Hash tables are described in chapter 16.
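As a minimal sketch of their use (the table and its keys are hypothetical):

```lisp
;; Strings are compared by content only under an EQUAL test.
(defparameter *ages* (make-hash-table :test #'equal))

(setf (gethash "alice" *ages*) 30)

(gethash "alice" *ages*)                ; => 30, T
(gethash "bob" *ages*)                  ; => NIL, NIL (no entry)
(hash-table-count *ages*)               ; => 1
```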
2.7. Readtables
A readtable is a data structure that maps characters into syntax types for the
Lisp expression parser. In particular, a readtable indicates for each character
with syntax macro character what its macro definition is. This is a mechanism
by which the user may reprogram the parser to a limited but useful extent. See
section 22.1.5.
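The following sketch reprograms the parser in a small way. It works on a copy
of the standard readtable and defines ! (one of the characters reserved to the
user) so that !form reads as (not form). The function name is hypothetical.

```lisp
(defun read-with-bang (string)
  ;; Bind *READTABLE* to a fresh copy so the change is local.
  (let ((*readtable* (copy-readtable nil)))
    (set-macro-character
     #\!
     #'(lambda (stream char)
         (declare (ignore char))
         (list 'not (read stream t nil t))))
    (read-from-string string)))

(read-with-bang "!foo")                 ; => (NOT FOO)
(read-with-bang "(a !b)")               ; => (A (NOT B))
```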
2.8. Packages
Packages are collections of symbols that serve as name spaces. The parser
recognizes symbols by looking up character sequences in the current package.
Packages can be used to hide names internal to a module from other code.
Mechanisms are provided for exporting symbols from a given package to the
primary ``user'' package. See chapter 11.
2.9. Pathnames
Pathnames are the means by which a Common Lisp program can interface to an
external file system in a reasonably implementation-independent manner. See
section 23.1.1.
2.10. Streams
A stream is a source or sink of data, typically characters or bytes. Nearly all
functions that perform I/O do so with respect to a specified stream. The
function open takes a pathname and returns a stream connected to the file
specified by the pathname. There are a number of standard streams that are used
by default for various purposes. See chapter 21.
[change_begin]
X3J13 voted in January 1989 (STREAM-ACCESS) to introduce subtypes of type
stream: broadcast-stream, concatenated-stream, echo-stream, synonym-stream,
string-stream, file-stream, and two-way-stream are disjoint subtypes of stream.
Note particularly that a synonym stream is always and only of type
synonym-stream, regardless of the type of the stream for which it is a synonym.
[change_end]
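String streams give a convenient way to sketch stream behavior without
touching a file system:

```lisp
;; Read two objects from a stream whose source is a string.
(with-input-from-string (in "(a b) 42")
  (list (read in) (read in)))           ; => ((A B) 42)

;; Collect output, including a line division, into a string.
(with-output-to-string (out)
  (write-string "line 1" out)
  (terpri out)
  (write-string "line 2" out))

(typep (make-string-input-stream "abc") 'string-stream)             ; => T
(typep (make-synonym-stream '*standard-output*) 'synonym-stream)    ; => T
```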
2.11. Random-States
An object of type random-state is used to encapsulate state information used by
the pseudo-random number generator. For more information about random-state
objects, see section 12.9.
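Because the generator's state is an ordinary object, it can be copied; a copy
then reproduces the same sequence of numbers. A minimal sketch:

```lisp
(let* ((state-1 (make-random-state t))         ;A freshly seeded state
       (state-2 (make-random-state state-1)))  ;A copy of it
  (equal (loop repeat 5 collect (random 1000 state-1))
         (loop repeat 5 collect (random 1000 state-2))))    ; => T
```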
2.12. Structures
Structures are instances of user-defined data types that have a fixed number of
named components. They are analogous to records in Pascal. Structures are
declared using the defstruct construct; defstruct automatically defines access
and constructor functions for the new data type.
Different structures may print out in different ways; the definition of a
structure type may specify a print procedure to use for objects of that type
(see the :print-function option to defstruct). The default notation for
structures is
#S(structure-name
slot-name-1 slot-value-1
slot-name-2 slot-value-2
...)
where #S indicates structure syntax, structure-name is the name (a symbol) of
the structure type, each slot-name is the name (also a symbol) of a component,
and each corresponding slot-value is the representation of the Lisp object in
that slot.
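A small sketch, using a hypothetical ship structure. The exact printed form
may vary; many implementations print the slot names as keywords.

```lisp
(defstruct ship name x-position y-position)

(defparameter *s*
  (make-ship :name "Enterprise" :x-position 0 :y-position 10))

(ship-p *s*)                            ; => T
(typep *s* 'ship)                       ; => T
(ship-y-position *s*)                   ; => 10
(setf (ship-x-position *s*) 5)
(ship-x-position *s*)                   ; => 5

;; Typically something like
;; #S(SHIP :NAME "Enterprise" :X-POSITION 5 :Y-POSITION 10)
(prin1-to-string *s*)
```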
2.13. Functions
[old_change_begin]
A function is anything that may be correctly given to the funcall or apply
function, and is to be executed as code when arguments are supplied.
A compiled-function is a compiled code object.
A lambda-expression (a list whose car is the symbol lambda) may serve as a
function. Depending on the implementation, it may be possible for other lists
to serve as functions. For example, an implementation might choose to represent
a ``lexical closure'' as a list whose car contains some special marker.
A symbol may serve as a function; an attempt to invoke a symbol as a function
causes the contents of the symbol's function cell to be used. See
symbol-function and defun.
The result of evaluating a function special form will always be a function.
[old_change_end]
[change_begin]
X3J13 voted in June 1988 (FUNCTION-TYPE) to revise these specifications. The
type function is to be disjoint from cons and symbol, and so a list whose car
is lambda is not, properly speaking, of type function, nor is any symbol.
However, standard Common Lisp functions that accept functional arguments will
accept a symbol or a list whose car is lambda and automatically coerce it to be
a function; such standard functions include funcall, apply, and mapcar. Such
functions do not, however, accept a lambda-expression as a functional argument;
therefore one may not write
(mapcar '(lambda (x y) (sqrt (* x y))) p q)
but instead one must write something like
(mapcar #'(lambda (x y) (sqrt (* x y))) p q)
This change makes it impermissible to represent a lexical closure as a list
whose car is some special marker.
The value of a function special form will always be of type function.
[change_end]
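Under the revised specification, the distinction between being a function and
being acceptable as a functional argument can be sketched as follows. The
results shown for functionp assume an implementation that follows the X3J13
vote.

```lisp
(functionp #'car)                       ; => T
(functionp 'car)                        ; => NIL (a symbol is not a function)
(functionp '(lambda (x) x))             ; => NIL (nor is a list)
(typep #'(lambda (x) x) 'function)      ; => T

;; A symbol is nevertheless coerced when used as a functional argument.
(funcall 'car '(1 2 3))                 ; => 1
(funcall #'car '(1 2 3))                ; => 1

;; Geometric means of corresponding elements: numerically 2 and 6.
(mapcar #'(lambda (x y) (sqrt (* x y))) '(1 4) '(4 9))
```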
2.14. Unreadable Data Objects
Some objects may print in implementation-dependent ways. Such objects cannot
necessarily be reliably reconstructed from a printed representation, and so
they are usually printed in a format informative to the user but not acceptable
to the read function: #<useful information>. The Lisp reader will signal an
error on encountering #<.
As a hypothetical example, an implementation might print
#<stack-pointer si:rename-within-new-definition-maybe #o311037552>
for an implementation-specific ``internal stack pointer'' data type whose
printed representation includes the name of the type, some information about
the stack slot pointed to, and the machine address (in octal) of the stack
slot.
[change_begin]
See print-unreadable-object, a macro that prints an object using #< syntax.
[change_end]
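Both directions can be sketched briefly. The helper try-read is hypothetical
and relies on the condition system of chapter 29; the printing example assumes
that *print-readably* is false, as it is by default.

```lisp
;; Return the object read from STRING, or :CANNOT-READ on a reader error.
(defun try-read (string)
  (handler-case (read-from-string string)
    (error () :cannot-read)))

(try-read "(a b)")                      ; => (A B)
(try-read "#<stack-pointer 42>")        ; => :CANNOT-READ

;; Produce the #< > wrapper around arbitrary output.
(with-output-to-string (out)
  (print-unreadable-object ('widget out)
    (write-string "useful information" out)))
                                        ; => "#<useful information>"
```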
2.15. Overlap, Inclusion, and Disjointness of Types
The Common Lisp data type hierarchy is tangled and purposely left somewhat
open-ended so that implementors may experiment with new data types as
extensions to the language. This section explicitly states all the defined
relationships between types, including subtype/supertype relationships,
disjointness, and exhaustive partitioning. The user of Common Lisp should not
depend on any relationships not explicitly stated here. For example, it is not
valid to assume that a number that is neither complex nor rational must be a
float, because implementations are permitted to provide yet other kinds of
numbers.
First we need some terminology. If x is a supertype of y, then any object of
type y is also of type x, and y is said to be a subtype of x. If types x and y
are disjoint, then no object (in any implementation) may be both of type x and
of type y. Types a_1 through a_n are an exhaustive union of type x if each a_j
is a subtype of x, and any object of type x is necessarily of at least one of
the types a_j; a_1 through a_n are furthermore an exhaustive partition if they
are also pairwise disjoint.
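These relationships can be examined with subtypep. The helpers known-subtype-p
and known-disjoint-p below are hypothetical names, given here only as a sketch.
Note that subtypep is permitted to answer ``cannot tell'' (a false second
value), particularly for compound type specifiers, so a false result from
these helpers does not prove that the relationship fails to hold.

```lisp
;; SUBTYPEP returns two values; the second is false when the
;; relationship could not be determined.
(defun known-subtype-p (y x)
  (multiple-value-bind (result certain) (subtypep y x)
    (and result certain)))

;; Two types are disjoint when no object belongs to both, that is,
;; when their intersection (and x y) is a subtype of the empty type nil.
(defun known-disjoint-p (x y)
  (known-subtype-p `(and ,x ,y) nil))
```

For example, (known-subtype-p 'integer 'rational) is true, while
(known-subtype-p 'rational 'integer) is false. An implementation whose
subtypep can reason about intersections should also report that
(known-disjoint-p 'cons 'symbol) is true.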
* The type t is a supertype of every type whatsoever. Every object is of
type t.
* The type nil is a subtype of every type whatsoever. No object is of type
nil.
[old_change_begin]
* The types cons, symbol, array, number, and character are pairwise
disjoint.
[old_change_end]
[change_begin]
X3J13 voted in June 1988 (DATA-TYPES-HIERARCHY-UNDERSPECIFIED) to extend the
preceding paragraph as follows.
* The types cons, symbol, array, number, character, hash-table, readtable,
package, pathname, stream, random-state, and any single other type created
by defstruct or defclass are pairwise disjoint.
The wording of the first edition was intended to allow implementors to use the
defstruct facility to define the built-in types hash-table, readtable, package,
pathname, stream, random-state. The change still permits this implementation
strategy but forbids these built-in types from including, or being included in,
other types (in the sense of the defstruct :include option).
X3J13 voted in June 1988 (FUNCTION-TYPE) to specify that the type function is
disjoint from the types cons, symbol, array, number, and character. The type
compiled-function is a subtype of function; implementations are free to define
other subtypes of function.
[change_end]
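The effect of the FUNCTION-TYPE vote can be observed with typep. The following
is only a sketch; function-type-report is a hypothetical name, and the results
described assume an implementation that follows the vote.

```lisp
;; Report, for four objects, whether each is of type FUNCTION.
;; The first two are function objects; the last two are a symbol
;; and a list that merely name or describe a function.
(defun function-type-report ()
  (mapcar #'(lambda (object) (if (typep object 'function) t nil))
          (list #'car
                #'(lambda (x) (* x x))
                'car
                '(lambda (x) (* x x)))))
```

Under the vote, the first two entries of the result are true and the last two
are false, because a symbol or a list can no longer be of type function.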
[old_change_begin]
* The types rational, float, and complex are pairwise disjoint subtypes of
number.
[old_change_end]
[change_begin]
X3J13 voted in March 1989 (REAL-NUMBER-TYPE) to rewrite the preceding item as
follows.
* The types real and complex are pairwise disjoint subtypes of number.
-------------------------------------------------------------------------------
Rationale: It might be thought that real and complex should form an exhaustive
partition of the type number. This is purposely avoided here in order to permit
compatible experimentation with extensions to the Common Lisp number system.
-------------------------------------------------------------------------------
* The types rational and float are pairwise disjoint subtypes of real.
-------------------------------------------------------------------------------
Rationale: It might be thought that rational and float should form an
exhaustive partition of the type real. This is purposely avoided here in order
to permit compatible experimentation with extensions to the Common Lisp number
system.
-------------------------------------------------------------------------------
[change_end]
* The types integer and ratio are disjoint subtypes of rational.
-------------------------------------------------------------------------------
Rationale: It might be thought that integer and ratio should form an exhaustive
partition of the type rational. This is purposely avoided here in order to
permit compatible experimentation with extensions to the Common Lisp rational
number system.
-------------------------------------------------------------------------------
[old_change_begin]
* The types fixnum and bignum are disjoint subtypes of integer.
-------------------------------------------------------------------------------
Rationale: It might be thought that fixnum and bignum should form an exhaustive
partition of the type integer. This is purposely avoided here in order to
permit compatible experimentation with extensions to the Common Lisp integer
number system, such as the idea of adding explicit representations of infinity
or of positive and negative infinity.
-------------------------------------------------------------------------------
[old_change_end]
[change_begin]
X3J13 voted in January 1989 (FIXNUM-NON-PORTABLE) to specify that the types
fixnum and bignum do in fact form an exhaustive partition of the type integer;
more precisely, they voted to specify that the type bignum is by definition
equivalent to (and integer (not fixnum)). This is consistent with the first
edition text in section 2.1.1.
I interpret this to mean that implementors could still experiment with such
extensions as adding explicit representations of infinity, but such infinities
would necessarily be of type bignum.
[change_end]
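The definition of bignum adopted by this vote can be written directly as a
type specifier. The two functions below are hypothetical helpers, offered as a
sketch and not as part of the language.

```lisp
;; True if OBJECT is an integer that is not a fixnum, which is the
;; voted definition of the type bignum.
(defun bignum-by-definition-p (object)
  (typep object '(and integer (not fixnum))))

;; True if the definition above and the built-in type bignum agree
;; about OBJECT. NOT is used to reduce both answers to T or NIL.
(defun agrees-with-bignum-p (object)
  (eq (not (typep object 'bignum))
      (not (bignum-by-definition-p object))))
```

For example, (bignum-by-definition-p (1+ most-positive-fixnum)) is true and
(bignum-by-definition-p 0) is false. In an implementation that follows the
vote, agrees-with-bignum-p should be true of every object.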
* The types short-float, single-float, double-float, and long-float are
subtypes of float. Any two of them must be either disjoint or identical;
if identical, then any other types between them in the above ordering must
also be identical to them (for example, if single-float and long-float are
identical types, then double-float must be identical to them also).
* The type null is a subtype of symbol; the only object of type null is
nil.
* The types cons and null form an exhaustive partition of the type list.
[old_change_begin]
* The type standard-char is a subtype of string-char; string-char is a
subtype of character.
[old_change_end]
[change_begin]
X3J13 voted in March 1989 (CHARACTER-PROPOSAL) to remove the type
string-char. The preceding item is replaced by the following.
* The type standard-char is a subtype of base-character. The types
base-character and extended-character form an exhaustive partition of
character.
[change_end]
[old_change_begin]
* The type string is a subtype of vector, for string means (vector
string-char).
[old_change_end]
[change_begin]
X3J13 voted in March 1989 (CHARACTER-PROPOSAL) to remove the type
string-char. The preceding item is replaced by the following.
* The type string is a subtype of vector; it is the union of all types
(vector c) such that c is a subtype of character.
[change_end]
* The type bit-vector is a subtype of vector, for bit-vector means (vector
bit).
* The types (vector t), string, and bit-vector are disjoint.
* The type vector is a subtype of array; for all types x, the type (vector
x) is the same as the type (array x (*)).
* The type simple-array is a subtype of array.
[old_change_begin]
* The types simple-vector, simple-string, and simple-bit-vector are
disjoint subtypes of simple-array, for they respectively mean
(simple-array t (*)), (simple-array string-char (*)), and (simple-array
bit (*)).
[old_change_end]
[change_begin]
X3J13 voted in March 1989 (CHARACTER-PROPOSAL) to remove the type
string-char. The preceding item is replaced by the following.
* The types simple-vector, simple-string, and simple-bit-vector are
disjoint subtypes of simple-array, for they mean (simple-array t (*)), the
union of all types (simple-array c (*)) such that c is a subtype of
character, and (simple-array bit (*)), respectively.
[change_end]
* The type simple-vector is a subtype of vector and indeed is a subtype of
(vector t).
* The type simple-string is a subtype of string. (Note that although string
is a subtype of vector, simple-string is not a subtype of simple-vector.)
-------------------------------------------------------------------------------
Rationale: The hypothetical name simple-general-vector would have been more
accurate than simple-vector, but in this instance euphony and user convenience
were deemed more important to the design of Common Lisp than a rigid symmetry.
-------------------------------------------------------------------------------
* The type simple-bit-vector is a subtype of bit-vector. (Note that
although bit-vector is a subtype of vector, simple-bit-vector is not a
subtype of simple-vector.)
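The relationships among the vector types can be checked by asking which of
them a given object belongs to. The function vector-kinds is a hypothetical
helper written for this sketch.

```lisp
;; Return, in order, those names from a fixed list of vector types
;; to which the object V belongs.
(defun vector-kinds (v)
  (remove-if-not #'(lambda (type) (typep v type))
                 '(vector string bit-vector
                   simple-vector simple-string simple-bit-vector)))

(vector-kinds (vector 1 2 3))
;; => (VECTOR SIMPLE-VECTOR)

(vector-kinds (make-string 3 :initial-element #\a))
;; => (VECTOR STRING SIMPLE-STRING)

(vector-kinds (make-array 3 :element-type 'bit :initial-element 0))
;; => (VECTOR BIT-VECTOR SIMPLE-BIT-VECTOR)
```

Note that neither the string nor the bit-vector is reported as a
simple-vector, in keeping with the remarks above.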
* The types vector and list are disjoint subtypes of sequence.
* The types random-state, readtable, package, pathname, stream, and
hash-table are pairwise disjoint.
[change_begin]
X3J13 voted in June 1988 (DATA-TYPES-HIERARCHY-UNDERSPECIFIED) to make
random-state, readtable, package, pathname, stream, and hash-table pairwise
disjoint from a number of other types as well; see note above.
X3J13 voted in January 1989 (STREAM-ACCESS) to introduce subtypes of type
stream.
* The types two-way-stream, echo-stream, broadcast-stream, file-stream,
synonym-stream, string-stream, and concatenated-stream are disjoint
subtypes of stream.
[change_end]
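Because these stream subtypes are disjoint, a given stream can belong to at
most one of them. The function stream-kind below is a hypothetical helper,
shown as a sketch that assumes an implementation providing the subtypes
introduced by the STREAM-ACCESS vote.

```lisp
;; Return the name of the first listed stream subtype to which S
;; belongs, or NIL if S belongs to none of them.
(defun stream-kind (s)
  (find-if #'(lambda (type) (typep s type))
           '(two-way-stream echo-stream broadcast-stream file-stream
             synonym-stream string-stream concatenated-stream)))

(stream-kind (make-string-input-stream "abc"))
;; => STRING-STREAM

(stream-kind (make-broadcast-stream))
;; => BROADCAST-STREAM
```

Since the subtypes are disjoint, the order of the names in the list does not
affect the result.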
* Any two types created by defstruct are disjoint unless one is a supertype
of the other by virtue of the :include option.
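As an illustration, consider three structure types; the names ship, tanker,
and dock are hypothetical and chosen only for this sketch.

```lisp
;; TANKER includes SHIP, so SHIP is a supertype of TANKER.
;; DOCK is unrelated to both.
(defstruct ship name)
(defstruct (tanker (:include ship)) capacity)
(defstruct dock berth)

(typep (make-tanker) 'ship)    ; true: every tanker is a ship
(typep (make-ship) 'tanker)    ; false: not every ship is a tanker
(typep (make-dock) 'ship)      ; false: dock and ship are disjoint
(subtypep 'tanker 'ship)       ; first value is true
```

The types ship and dock are disjoint because neither includes the other; the
types ship and tanker are not disjoint, because of the :include option.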
[old_change_begin]
* An exhaustive union for the type common is formed by the types cons,
symbol, (array x) where x is either t or a subtype of common, string,
fixnum, bignum, ratio, short-float, single-float, double-float,
long-float, (complex x) where x is a subtype of common, standard-char,
hash-table, readtable, package, pathname, stream, random-state, and all
types created by the user via defstruct. An implementation may not
unilaterally add subtypes to common; however, future revisions to the
Common Lisp standard may extend the definition of the common data type.
Note that a type such as number or array may or may not be a subtype of
common, depending on whether or not the given implementation has extended
the set of objects of that type.
[old_change_end]
[change_begin]
X3J13 voted in March 1989 (COMMON-TYPE) to remove the type common from the
language.
[change_end]